Data allocation in a Heterogeneous Disk Array (HDA) with multiple RAID levels for database applications
Abstract
We consider the allocation of Virtual Arrays (VAs) in a Heterogeneous Disk Array (HDA). Each VA holds groups of related objects and datasets, such as files and relational tables, that have similar performance and availability characteristics. We evaluate single-pass data allocation methods for HDA using a synthetic stream of allocation requests, where each VA is characterized by its RAID level, disk loads, and space requirements. The goal is to maximize the number of allocated VAs and maintain high disk bandwidth and capacity utilization while balancing disk loads. Although only RAID1 (basic mirroring) and RAID5 (rotated parity arrays) are considered in the experimental study, we develop the analysis required to estimate disk loads for other RAID levels. Since VA loads vary significantly over time, VA allocation is carried out for the peak load period, while ensuring that disk bandwidth is not exceeded during other high-load periods. The experimental results show that allocation methods that minimize the maximum disk bandwidth and capacity utilization, or their variance across all disks, yield the maximum number of allocated VAs. HDA saves disk bandwidth, since a single RAID level accommodating the most stringent availability requirements of a small subset of objects would incur an unnecessarily high overhead for updating check blocks or data replicas for all objects. The number of allocated VAs can be increased by adopting the clustered RAID5 paradigm, which exploits the tradeoff between redundancy and bandwidth utilization. Since rebuild can be carried out at the level of individual VAs, prioritizing the rebuild of VAs with higher access rates can improve overall performance.
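The single-pass allocation idea in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: it is a minimal greedy variant that assumes each VA places an equal load and space share on each of its disks, ignores RAID-level-specific load estimation, and picks the disks whose larger utilization (bandwidth or capacity) after placement is smallest. The names `Disk`, `VA`, and `allocate` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Disk:
    bandwidth: float   # maximum sustainable load, e.g. accesses per second
    capacity: float    # usable space, e.g. GB
    load: float = 0.0  # load already allocated
    used: float = 0.0  # space already allocated

    def utilization_after(self, load, space):
        """Larger of bandwidth and capacity utilization if this share were added."""
        return max((self.load + load) / self.bandwidth,
                   (self.used + space) / self.capacity)


@dataclass
class VA:
    name: str
    width: int             # number of disks the VA is spread over (assumed)
    load_per_disk: float   # peak-period load each disk receives (assumed uniform)
    space_per_disk: float  # space each disk receives (assumed uniform)


def allocate(disks, requests):
    """Process requests once, in order; reject a VA if any chosen disk would overflow."""
    placed = {}
    for va in requests:
        def cost(i):
            return disks[i].utilization_after(va.load_per_disk, va.space_per_disk)

        # Rank disks by resulting utilization; ties keep the lower disk index.
        chosen = sorted(range(len(disks)), key=cost)[:va.width]
        if len(chosen) < va.width or any(cost(i) > 1.0 for i in chosen):
            continue  # not enough disks, or bandwidth/capacity would be exceeded
        for i in chosen:
            disks[i].load += va.load_per_disk
            disks[i].used += va.space_per_disk
        placed[va.name] = chosen
    return placed


disks = [Disk(bandwidth=100.0, capacity=100.0) for _ in range(4)]
requests = [
    VA("a", width=2, load_per_disk=40.0, space_per_disk=10.0),
    VA("b", width=2, load_per_disk=40.0, space_per_disk=10.0),
    VA("c", width=4, load_per_disk=70.0, space_per_disk=10.0),  # would exceed bandwidth
    VA("d", width=1, load_per_disk=10.0, space_per_disk=80.0),
]
placed = allocate(disks, requests)
```

In this toy run, request `c` is rejected because every disk already carries a load of 40 and adding 70 would exceed the assumed bandwidth of 100. A variance-minimizing policy, which the abstract also mentions, would change only the `cost` function.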
Similar resources
Row-Diagonal Parity for Double Disk Failure Correction (Awarded Best Paper!)
Row-Diagonal Parity (RDP) is a new algorithm for protecting against double disk failures. It stores all data unencoded, and uses only exclusive-or operations to compute parity. RDP is provably optimal in computational complexity, both during construction and reconstruction. Like other algorithms, it is optimal in the amount of redundant information stored and accessed. RDP works within a single...
Dynamic Multiple Parity (DMP) Disk Array for Serial Transaction Processing
The performance of today's database systems is usually limited by the speed of their I/O devices. Fast I/O systems can be built from an array of low cost disks working in parallel. This kind of disk architecture is called RAID (Redundant Arrays of Inexpensive Disks). RAID promises improvement over SLED (Single Large Expensive Disks) in performance, reliability, power consumption, and scalabili...
Reliable Cluster Computing with a New Checkpointing RAID-x Architecture
In a serverless cluster of PCs or workstations, the cluster must allow remote file accesses or parallel I/O directly performed over disks distributed to all client nodes. We introduce a new distributed disk array, called the RAID-x, for use in serverless clusters. The RAID-x architecture is based on an orthogonal striping and mirroring (OSM) scheme, which exploits full-bandwidth and protects th...
Striping Policies in Multiclass Disk Arrays Ph.D. Thesis Proposal
Redundant Arrays of Inexpensive Disks (RAID) provide data striping for improved performance and redundancy for increased reliability. Workloads utilizing RAID disk arrays have been divided into two categories, those characterized by large, sequential accesses and those characterized by small, random accesses, typically denoted as scientific applications and online transaction processing (OLTP) ...
Performance of Future Database Systems: Bottlenecks and Bonanzas
Current trends in database systems include the incorporation of parallel processing; object-relational capabilities; support for data warehousing, data mining, and OLAP. The typical hardware systems on which such database systems are being implemented include SMP’s, MPP’s, clusters, 64-bit processors, disk caches, and RAID and other high availability configurations. In addition, as database techn...
Journal title:
- Comput. Syst. Sci. Eng.
Volume 31
Publication date: 2016